SONARJAVA-6524 Generate built-in profiles from rule metadata #5705
romainbrenguier wants to merge 9 commits
Thanks for the comprehensive review. I've addressed both remaining suggestions:
Both changes improve the build configuration clarity and help prevent silent profile-membership mistakes.
private static final String PROFILES_RELATIVE_PATH = "src/main/resources/profiles";

private static final Map<String, String> PROFILES = Map.of(
  "sonar_way", "Sonar way",
  "sonar_agentic_ai", "Sonar agentic AI"
);

public static void main(String[] args) throws IOException {
  if (args.length != 2) {
    throw new IllegalArgumentException("Expected module and output directories as arguments.");
  }

  Path profilesDirectory = Path.of(args[0]).resolve(PROFILES_RELATIVE_PATH);
  Path outputDirectory = Path.of(args[1]).resolve("org/sonar/l10n/java/rules/java");

⚠️ Bug: S8914 Sonar way marker placed in wrong directory
The new rule S8914 gets its Sonar way profile membership from an empty marker file, but the file was created at .../rules/java/profiles/Sonar_way/S8914 instead of the directory the build actually reads.
ProfileJsonGenerator composes profiles by listing src/main/resources/profiles/<profileKey> where profileKey is sonar_way/sonar_agentic_ai (see PROFILES_RELATIVE_PATH = "src/main/resources/profiles" and the loop at lines 46-55). The README confirms rules must be added via src/main/resources/profiles/sonar_way/<RuleKey>. Compare with the sibling change in this same PR that correctly added profiles/sonar_way/S8948.
The location used for S8914 (org/sonar/l10n/java/rules/java/profiles/Sonar_way/, capital Sonar_way) is not scanned by the generator — that directory now contains only S8914. As a result S8914 will be omitted from the generated Sonar_way_profile.json, so the new VULNERABILITY rule will not be enabled in the default Sonar way profile despite the marker suggesting that was the intent.
Fix: move the marker to the directory the generator reads (and delete the stray one).
Place the S8914 profile marker in resources/profiles/sonar_way and remove the misplaced one:
# create the marker in the directory ProfileJsonGenerator scans
touch sonar-java-plugin/src/main/resources/profiles/sonar_way/S8914
# remove the stray marker in the wrong location
rm sonar-java-plugin/src/main/resources/org/sonar/l10n/java/rules/java/profiles/Sonar_way/S8914
Replace metadata-based profile generation with directory-based approach. Each rule's profile membership is now represented by a file in profile-specific directories (profiles/sonar_way/, profiles/sonar_agentic_ai/). This eliminates merge conflicts when parallel PRs add rules to profiles, as each PR creates a new file instead of editing a shared JSON array.

Changes:
- Add ProfileJsonGenerator to scan profile directories and generate JSONs
- Create profile directories with 534 (Sonar way) and 467 (Agentic AI) rule files
- Update pom.xml to generate and copy profiles during build
- Add README.md with usage instructions
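To make the directory-based approach in this commit message concrete, here is a minimal sketch of how marker files could be turned into a profile JSON. It is not the `ProfileJsonGenerator` from this PR: the class name `ProfileDirectorySketch`, its method names, and the `{"name": ..., "ruleKeys": [...]}` output shape are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; names and output shape are assumptions, not the PR's code.
final class ProfileDirectorySketch {

  // Returns the names of the regular files in one profile directory,
  // for example src/main/resources/profiles/sonar_way.
  static List<String> listMarkerNames(Path profileDirectory) {
    try (Stream<Path> files = Files.list(profileDirectory)) {
      return files
        .filter(Files::isRegularFile)
        .map(path -> path.getFileName().toString())
        .collect(Collectors.toList());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Keeps only names shaped like S<digits> and sorts them by their numeric part.
  static List<String> sortedRuleKeys(Collection<String> markerNames) {
    return markerNames.stream()
      .filter(name -> name.matches("S\\d+"))
      .sorted(Comparator.comparingInt((String name) -> Integer.parseInt(name.substring(1))))
      .collect(Collectors.toList());
  }

  // Renders the profile as a single-line JSON object (assumed shape).
  static String toProfileJson(String profileName, List<String> ruleKeys) {
    String keys = ruleKeys.stream()
      .map(key -> "\"" + key + "\"")
      .collect(Collectors.joining(", "));
    return "{\"name\": \"" + profileName + "\", \"ruleKeys\": [" + keys + "]}";
  }
}
```

Because each rule is a separate file, two PRs that add different rules touch different paths, and the sorted output stays deterministic regardless of the order in which the file system lists them.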
…7) to match the 468 files in sonar_agentic_ai profile directory; updated MetadataTest to read generated Sonar_way_profile.json from target/classes/ instead of src/main/resources/ since it is now generated during the build
Comment: <details> <summary><b>Code Review</b> <kbd>👍 Approved with suggestions</kbd> <kbd>5 resolved / 7 findings</kbd></summary> Automates built-in profile generation by moving rule membership metadata into individual rule files, resolving issues with stale JSON tracking and brittle manual updates. Consolidate the duplicate copy operations in the build configuration and refine the rule-key validation logic to prevent silent file drops. <details> <summary>💡 <b>Quality:</b> Generated profiles copied twice via <resources> and copy-resources</summary> <kbd>📄 <a href="https://github.com/SonarSource/sonar-java/pull/5705/files#diff-a2a59812e774224a494679a03de77f5fe24ceb84295e379d6b9583ef97a1ee15R148-R155">sonar-java-plugin/pom.xml:148-155</a></kbd> <kbd>📄 <a href="https://github.com/SonarSource/sonar-java/pull/5705/files#diff-a2a59812e774224a494679a03de77f5fe24ceb84295e379d6b9583ef97a1ee15R397-R411">sonar-java-plugin/pom.xml:397-411</a></kbd> The build both declares `${project.build.directory}/generated-resources/profiles` as a `<resource>` directory (which the default `process-resources` execution already copies into `${project.build.outputDirectory}`) and adds a separate `copy-generated-profiles` maven-resources-plugin execution that copies the same directory to the same `outputDirectory`. The two mechanisms are redundant. Keeping only one (the `<resources>` entry is sufficient) would reduce confusion and avoid double-processing the same files. <details> <summary>Fix</summary> ```` <!-- Remove the redundant copy-generated-profiles execution; the <resources> entry for generated-resources/profiles already copies the files into ${project.build.outputDirectory} during process-resources. 
--> ```` </details> </details> <details> <summary>💡 <b>Quality:</b> Misnamed rule-key files are silently dropped from profiles</summary> <kbd>📄 <a href="https://github.com/SonarSource/sonar-java/pull/5705/files#diff-527d6d3ff6d0b2988ebdcb2fe8ecc63ce3bf3ce782e105cb2ddd1881b66929edR67">sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:67</a></kbd> <kbd>📄 <a href="https://github.com/SonarSource/sonar-java/pull/5705/files#diff-527d6d3ff6d0b2988ebdcb2fe8ecc63ce3bf3ce782e105cb2ddd1881b66929edR73-R76">sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:73-76</a></kbd> `collectRuleKeys` filters profile-directory entries with `isValidRuleKey` (`S\d+`). Any file that does not exactly match — e.g. a typo like `s106` (lowercase), `S106 ` (trailing space), or `S106.txt` — is silently skipped, so the corresponding rule disappears from the generated profile with no error or warning. Given the whole design relies on humans creating empty files named after rule keys, a silent drop makes profile-membership mistakes hard to detect. Consider logging a warning for files in a profile directory that do not match the expected rule-key pattern (excluding known files such as README/.gitignore). 
<details> <summary>Fix</summary> ```` files .filter(Files::isRegularFile) .map(Path::getFileName) .map(Path::toString) .peek(name -> { if (!isValidRuleKey(name)) { System.err.println("Ignoring non-rule-key file in profile directory: " + name); } }) .filter(ProfileJsonGenerator::isValidRuleKey) .sorted(Comparator.comparingInt(ProfileJsonGenerator::numericKey)) .collect(Collectors.toList()); ```` </details> </details> <details> <summary><kbd>✅ 5 resolved</kbd></summary> <details> <summary>✅ <b>Quality:</b> Profile generator silently drops rules with unknown profile names</summary> > <kbd>📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:64-72</kbd> <kbd>📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:84-97</kbd> > `collectKeysByProfile` looks up each profile name extracted from a rule's `defaultQualityProfiles` with `keysByProfile.get(profile)` and only adds the rule key when the returned list is non-null. Any profile name that is not exactly one of the two keys in `PROFILES` ("Sonar way", "Sonar agentic AI") is therefore silently ignored. > > This migration moves profile membership into ~500 hand-edited rule metadata files, so a typo such as "Sonar Way", "sonar way", or "Sonar agentic Al" in any single rule would silently exclude that rule from the built-in profile with no error. The safety nets are weak: `MetadataTest.ensure_sane_Sonar_way_profile` only asserts the Sonar way size is `> 400`, so a handful of dropped rules would go completely unnoticed (the agentic test uses an exact size, but Sonar way does not). Likewise, a rule whose JSON omits `defaultQualityProfiles` entirely is silently excluded. > > Recommend failing the build (or at minimum warning) when a rule references a profile name that is not in `PROFILES`, so accidental omissions surface at build time instead of shipping an incomplete profile. 
</details> <details> <summary>✅ <b>Quality:</b> Regex-based JSON parsing in ProfileJsonGenerator is fragile</summary> > <kbd>📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:33-35</kbd> <kbd>📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:84-97</kbd> <kbd>📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:99-105</kbd> > `ProfileJsonGenerator` extracts `sqKey` and `defaultQualityProfiles` via hand-written regular expressions rather than a JSON parser. This works for the current well-formatted metadata, but it is brittle: `JSON_STRING_PATTERN` blindly captures every quoted token inside the `defaultQualityProfiles` array, so any future change such as an inline comment, an escaped quote, or reformatting could yield wrong profile names or miss entries. Because the generator runs as a single-file source launch (`java ProfileJsonGenerator.java`) it cannot easily depend on Gson; however the fragility is worth a comment and tight patterns. Consider at least documenting the assumption that metadata files are machine-generated and strictly formatted, and validating extracted profile names against the known set (see related finding) so malformed input cannot silently produce an incorrect profile. </details> <details> <summary>✅ <b>Bug:</b> Stale source profile JSONs collide with generated ones</summary> > <kbd>📄 sonar-java-plugin/pom.xml:148-155</kbd> <kbd>📄 sonar-java-plugin/pom.xml:397-411</kbd> <kbd>📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:42</kbd> <kbd>📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:56-57</kbd> > The PR's stated goal is to "stop tracking the generated profile JSONs," but the old hand-maintained files are still present in source: `sonar-java-plugin/src/main/resources/org/sonar/l10n/java/rules/java/Sonar_way_profile.json` and `Sonar_agentic_AI_profile.json` (the diff shows 0 deletions). 
`ProfileJsonGenerator` now writes freshly generated files to the SAME packaged path (`org/sonar/l10n/java/rules/java/Sonar_way_profile.json`). > > In the pom, both `src/main/resources` and `${project.build.directory}/generated-resources/profiles` are declared as resource directories (lines 148-155), and there is also a `copy-generated-profiles` copy-resources execution. Both the stale src copy and the generated copy resolve to the identical target path in `target/classes`. Which one ends up packaged depends entirely on maven-resources-plugin copy ordering and its `overwrite` timestamp semantics (by default a resource is only copied when the source is newer than the destination). This is fragile: the plugin may ship the stale, hand-maintained profile instead of the generated one, and at minimum the two definitions can silently diverge while both remain authoritative-looking. > > Delete the old `Sonar_way_profile.json` / `Sonar_agentic_AI_profile.json` from `src/main/resources` so the generated artifact is the single source of truth, and ensure the per-rule profile membership files fully reproduce the previous profile contents. </details> <details> <summary>✅ <b>Edge Case:</b> numericKey throws cryptic NumberFormatException on stray files</summary> > <kbd>📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:61-75</kbd> > `collectRuleKeys` lists every regular file in a profile directory and feeds each filename to `numericKey`, which does `Integer.parseInt(ruleKey.substring(1))`. Any file whose name is not exactly `S<digits>` — e.g. a `.gitkeep`, `.DS_Store`, editor swap file, or a typo'd rule key such as `S891O` (letter O) — causes a `NumberFormatException` that aborts the build with an opaque message ("For input string ...") and no indication of the offending directory/file. 
> > Consider filtering to files matching `S\d+` (and/or sorting with a fallback comparator) and throwing a descriptive error that names the bad file, so contributors immediately understand the problem. </details>
<details> <summary>✅ <b>Bug:</b> MetadataTest reads deleted src/main/resources profile JSON</summary> > <kbd>📄 sonar-java-plugin/src/main/resources/org/sonar/l10n/java/rules/java/.gitignore:1</kbd> > This PR deletes `src/main/resources/org/sonar/l10n/java/rules/java/Sonar_way_profile.json` (and the agentic one) and adds a `.gitignore` for `*_profile.json`, so the profile JSONs now only exist as generated artifacts under `target/generated-resources` / `target/classes`. However `MetadataTest.ensure_sane_Sonar_way_profile()` still reads the profile via a hard-coded filesystem path: `Path.of("src/main/resources/" + JavaSonarWayProfile.SONAR_WAY_PATH)` and opens it with `Files.newReader(profilePath.toFile(), ...)`. Since that file no longer exists in the source tree, the test will fail with FileNotFoundException. This test is explicitly listed in the PR's test command (`-Dtest=MetadataTest,...`). The PR description says tests should be updated 'to validate the generated classpath resources instead of src/main/resources files', but MetadataTest was not updated. Point the test at the generated output (e.g. `target/classes` + SONAR_WAY_PATH) or load it from the classpath via `getResourceAsStream(SONAR_WAY_PATH)`. </details> </details> </details>
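The open finding about misnamed marker files could also be handled more strictly than with a warning. The sketch below fails with a message that names the offending files. The class `MarkerNameCheckSketch`, its methods, and the allow-list of ignored file names are hypothetical and not part of the PR.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch of a fail-fast check; not the PR's implementation.
final class MarkerNameCheckSketch {

  private static final Pattern RULE_KEY = Pattern.compile("S\\d+");

  // Assumed allow-list of non-marker files that may live in a profile directory.
  private static final Set<String> IGNORED = Set.of("README.md", ".gitignore");

  // Returns, in sorted order, every file name that is neither ignored nor a rule key.
  static List<String> invalidNames(Collection<String> fileNames) {
    return fileNames.stream()
      .filter(name -> !IGNORED.contains(name))
      .filter(name -> !RULE_KEY.matcher(name).matches())
      .sorted()
      .collect(Collectors.toList());
  }

  // Throws when a profile directory contains unexpected files, naming each of them.
  static void requireOnlyRuleKeys(String profileKey, Collection<String> fileNames) {
    List<String> invalid = invalidNames(fileNames);
    if (!invalid.isEmpty()) {
      throw new IllegalStateException(
        "Unexpected files in profile directory '" + profileKey + "': " + invalid);
    }
  }
}
```

Whether to fail the build or only warn is a design choice for the maintainers. Failing surfaces typos such as `s106` or `S106.txt` immediately, at the cost of breaking the build on stray editor or OS files unless they are added to the allow-list.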
These files are now generated during the Maven build from the profile directories (sonar_way/ and sonar_agentic_ai/), so they should not be tracked in git. The generated files are placed in target/classes/ during the build.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
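Since the profile JSONs exist only as build output, code that reads them may be better off using the classpath than a source-tree path. A minimal sketch follows, assuming a hypothetical helper class `GeneratedProfileLoaderSketch`; only `ClassLoader.getResourceAsStream` and `InputStream.readAllBytes` are real JDK calls here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the class and method names are illustrative only.
final class GeneratedProfileLoaderSketch {

  // Reads a classpath resource as UTF-8 text and fails with a clear message when it is absent.
  static String readResource(String resourcePath) {
    ClassLoader loader = GeneratedProfileLoaderSketch.class.getClassLoader();
    try (InputStream in = loader.getResourceAsStream(resourcePath)) {
      if (in == null) {
        throw new IllegalStateException(
          "Resource not found on classpath: " + resourcePath + " (has the profile generator run?)");
      }
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A test could then call `readResource("org/sonar/l10n/java/rules/java/Sonar_way_profile.json")`, which resolves against `target/classes` once the build has generated the file, instead of depending on `src/main/resources`.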
…rule not in sonar_way profile directory) and created diff_S6548.json (rule present in sonar_way profile directory after directory-based migration)
…5 to 468 to match the 468 files in the sonar_agentic_ai profile directory
…ry (they were incorrectly included during directory-based migration) and updated expected rule count from 468 to 466 in JavaAgenticWayProfileTest
…ory, which was not migrated during the switch from JSON-based to directory-based profile composition
…ory, which was not migrated during the switch from JSON-based to directory-based profile composition
Summary
Testing
Summary by Gitar
- ProfileJsonGenerator.java to automate the creation of profile JSON files during the build process.
- pom.xml to include generated resource directories and configured exec-maven-plugin to execute the generator.
- README.md in src/main/resources/profiles/ detailing the new rule management and build process.
- JavaAgenticWayProfileTest to reflect the change in the total count of active rules from 465 to 466.
- MetadataTest to point to the generated profile resources in target/classes/.